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Abstract. We describe a modification of a fourth-order accurate "moving 
puncture" evolution code, where by replacing spatial fourth-order accurate 
differencing operators in the bulk of the grid by a specific choice of sixth-order 
accurate stencils we gain significant improvements in accuracy. We illustrate the 
performance of the modified algorithm with an equal-mass simulation covering 
nine orbits. 
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1. Introduction 

In the last two years the numerical solution of the general relativistic two-body problem 
has made a giant leap forward with a series of breakthroughs in 2005 [TJ [5] [3J . More 
than forty years after Hahn and Lindquist started the numerical investigation of 
colliding black holes [3], the field has now passed several crucial milestones toward 
simulating general inspiral situations, such as simulations of unequal-mass binaries and 
calculations of the gravitational recoil effect and the evolution of black-hole binaries 
with spin [5] [6] [7] [8] [9] [10] . The latter have recently lead to the spectacular finding 
that extremely large recoils are possible for spinning black holes [IT] [12l [13] . 

In order to fulfill numerical relativity's promise of providing useful information 
to the gravitational wave data analysis community, it is desirable to perform long 
numerical inspiral evolutions that allow us to cleanly match fully general relativistic 
and post-Newtonian waveforms with error bars, and thus produce "complete" 
waveforms, which contain large numbers of gravitational- wave cycles from the inspiral 
phase, as well as simulating the merger and ringdown phases. Such simulations will be 
necessary for a sufficiently dense sample of the black-hole binary configuration space. 

Comparisons with post-Newtonian results have already started, and several 
groups have published results showing good agreement of various aspects of non- 
spinning simulations with post-Newtonian predictions (see e.g. [HI [T5l [TB] [T7] HH HH 
[20]). Precise error estimates and detailed coverage of the unequal-mass and spinning 
cases are however missing. One serious technical problem is that performing such long 
evolutions with good accuracy in the phase is still computationally expensive — at 
least for standard "moving puncture" finite-difference codes [5TJ [55] [53] (TH] EH QH] • 
In order to overcome phase inaccuracies in long evolutions, an alternative route is 
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provided by spectral methods, and significant progress has been made in this direction 
by the Caltech-Cornell group [251125]. 

The initial data we evolve differ from the choice in [25J [25], and we cannot 
make a direct comparison of the accuracy of the results since puncture evolutions 
cannot start with excision initial data. It would certainly be worthwhile in the 
future to conduct direct comparisons regarding efficiency for different codes. We only 
quote some preliminary numbers here to illustrate the fact that while ultimately the 
convergence behavior of spectral codes is superior, for the purpose of "gravitational 
wave astrophysics" , finite difference codes may still be competitive: The performance 
of the Caltech/Cornell spectral code has been quoted for long time medium resolution 
runs as roughly 10 (for a "64 3 " grid configuration) and 27 ("76 3 " grid) CPU hours 
per M [26] (we will quote time and length units of the total black-hole mass M, see 
[16]). The highest resolution run presented here performs at 5.5 CPU hours per M 
on an Intel Woodcrest 2.66 GHz dual core processor. Both codes show satisfactory 
accuracy for long evolutions, and we certainly expect both codes to undergo further 
optimizations. Long evolutions that show fourth order convergence for most of the 
simulation have also been presented by the Goddard group [27] . using impressively 
low spatial resolution, but no details are given on code timing. 

We have previously reported accurate evolutions for approximately two orbits 
(initial separation of D = 6.45M) in [15] , with an error in the merger time of 0.2 % at 
a computational cost of 505 CPU hours (1.44 CPU hours/M), and we have reported a 
merger time error of 0.5% for D = 8M simulations [28]. At larger initial separations 
the number of orbits is a steep function of the separation, and phase accuracy rapidly 
decreases. Our fourth order code would thus require resolutions which we find hard 
to tolerate for performing large parameter studies in the style of [BJ. In the context 
of finite differencing it is natural to consider higher order methods. For example, it 
already turned out to be important to move from second-order finite differencing to 
fourth-order finite differencing as the feasible evolution time for puncture evolutions 
increased. 

However, using higher order finite differencing for the types of codes used to 
simulate black-hole binary inspiral is not entirely trivial: for the moving- puncture 
method, which is currently employed in the majority of the current codes, the 
continuum equations become singular at the location of the "puncture", and one 
might worry about the robustness of finite-difference schemes. Furthermore, current 
mesh refinement algorithms in the field are based on the use of buffer zones, whose 
number depends on the stencil width of the finite differencing scheme. In three spatial 
dimensions high-order finite-difference schemes with wide stencils easily lead to a 
drastic decrease in performance caused by the additional computational load due to 
the extra buffer points. This gain in accuracy pertains in particular to the phase of 
the evolution. 

In the present paper we report on a first step to significantly improve the accuracy 
of current finite-difference codes to evolve black-hole binaries by using sixth order 
accurate finite differencing operators in the bulk of the grid. We combine the sixth 
order accurate derivative operators with fourth-order accurate dissipation operators 
[setting r = 3 in Eq. ([3])] and Runge-Kutta time integrators, and aggressively reduce 
the number of AMR buffer zones compared to the number we would theoretically 
require for sixth order convergence. The penalty in computational cost is a rather 
moderate 30% compared with our fourth-order code. (We have compared the average 
speed over 100 M of evolution time for our largest grid configuration, which produces 
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a 29% increase in computational cost; in our experience this is typical for our current 
code). 

2. Summary of the "moving-puncture method" as implemented in the 
BAM code 

There is a large freedom in writing the Einstein equations as a system of partial 
differential equations, and much research has gone into finding optimal choices. In this 
work we employ the currently most popular choice, the BSSN system [29, I3T1 132) . 
which so far is the only system for which results have been reported of long-time 
black-hole binary simulations that do not rely on black hole excision as has been used 
for example in [TJ [2H [25] . 

We use the BSSN system together with the 1+log and gamma freezing coordinate 
gauges [331 [Ml IS] as described in [TB] (choosing in particular the parameter rj in 
the gamma freezing shift condition as r\ = 2 as we have done previously). These 
gauge conditions allow the "punctures" to move across the grid ("moving puncture" 
approach [21 [3]) and allow an effective softening of the singularity in the metric 
associated with an internal asymptotic region .''>•"> 30 37 . which had been prohibited 
by the traditional "fixed punctures" approach. The BSSN system is based on a 
conformal decomposition of the spatial geometry writing the physical spatial metric as 
gij = x _1 <?ij (following |2J). The blowup of the metric at the "punctures" is absorbed 
into the conformal factor %, which vanishes at the "puncture" . 

For our numerical evolutions we use the BAM [351 132] code, which is designed to 
solve partial differential equations on structured meshes, in particular a coupled system 
of (typically hyperbolic) evolution equations and elliptic equations. The complexity 
of the equations is addressed by using a Mathematica package integrated into the 
code, which produces C-code from Mathematica expressions in tensor notation. 
Using such a system as we do in BAM, or as has been discussed in detail for the 
Cactus environment in [40] drastically simplifies the modification of complex codes 
for black-hole binary simulations, as was required to adapt codes from the "fixed 
puncture" to the "moving puncture" paradigm, or in the present case to implement 
the improved numerical algorithms discussed here. The structure of the BAM code 
has also made it straightforward to implement higher order finite differencing methods 
The computational domain is decomposed into rectangular boxes, following standard 
domain-decomposition algorithms, and is parallelized with MPI |41j . Our mesh 
refinement algorithm is based on the standard Berger-Oliger algorithm, but with 
additional buffer zones, along the lines of [42j [43] as described in [16] and summarized 
in the next section. We essentially use a fixed-mesh-refinement strategy, with inner 
level refinement boxes following the motion of the black holes. Typically we use about 
10 refinement levels (refining the grid spacing by factors of 2), roughly half of which 
follow the movement of the black holes. 

In order to represent black holes in the initial data, we use the so-called "puncture 
method" [44 . For these data it is well understood how to write the constraint 
equations in a form suitable for numerical solution [45l [46] . Following the approach 
of [44] our initial data sets are chosen to be conformally flat with Bowen-York 
extrinsic curvature. The momentum parameter in the Bowen-York extrinsic curvature 
is determined from a quasi-equilibrium condition at third post-Newtonian order as 
described in [16] . The elliptic constraint equations are solved in BAM with the pseudo- 
spectral collocation code described in [47] . AMR data are then obtained by barycentric 
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interpolation, typically with eighth-order polynomials for both the fourth- and sixth- 
order finite differencing methods. The efficiency of the spectral solver is sufficient to 
solve the initial data problem on a single processor. 

3. Sixth order finite differencing 

3.1. Mesh refinement in the BAM code 

Our numerical evolution algorithm is based on a method-of-lines approach using finite 
differencing in space and explicit fourth-order accurate Runge-Kutta time stepping 
(with a fixed time step). We apply sixth-order accurate polynomial interpolation in 
space between different refinement levels so that all spatial operations of the AMR 
method (i.e. restriction and prolongation) are sixth-order accurate, such that the 
second derivatives of interpolated values are at least fourth-order accurate. Although 
the time stepping used for evolution is also fourth-order accurate through the Runge- 
Kutta integrator, there arises the additional issue of how to provide boundary values 
for the intermediate time-levels of the Berger-Oliger algorithm that are not aligned 
in time with a coarser level (otherwise spatial interpolation can be used). Using 
higher than third-order interpolation has lead to spurious noise at mesh refinement 
boundaries as described in [16j . We therefore use third-order interpolation in 
time, which introduces a second-order error within the Berger-Oliger time-stepping 
scheme |f 6j . which however is not noticeable in typical runs as we have checked by 
running with uniform (as opposed to Berger-Oliger non-uniform) time-stepping. In 
summary, if the outer boundary is placed sufficiently far away and if time- interpolation 
errors at refinement boundaries are small, then fourth-order convergence can be 
observed. 

A relatively straightforward modification of the standard Berger-Oliger scheme is 
to replace the single-point refinement boundary by a buffer zone consisting of several 
points, e.g., [42, 43, 48J. For a sufficiently large number of buffer zones (the product of 
number of points in the stencil toward the mesh refinement boundary and the number 
of source evaluations during a full time step of the coarse grid), no time interpolation is 
required and excellent results have been reported for this scheme [43] (note that special 
methods like [49] seem to achieve similar performance). For example, our fourth-order 
Runge Kutta scheme requires four source evaluations, and if the lop-sided stencil with 
three points in one direction, 

f(x) = " 3/ - X " 10/0 +^ 8/l ~ 6/2 + /3 - ^/ (5 W + 0(h*) (1) 

is used, then the numerical domain of dependence for a given point has a radius of 12 
points. Here and in the following we use the notation fj = f(x + jAx). Therefore, it 
is possible to provide 12 buffer points at the refinement boundary and to perform one 
RK4 time step with size three stencils that does not require any boundary updates. 
Only after the time step is completed, the buffer zones have to be repopulated. In 
the context of Berger-Oliger AMR, the buffer update is based on interpolation from 
the coarser levels. Since every second time step at level I coincides in time with level 
I — 1, one can provide 24 buffer points, perform two time steps, and then update the 
buffer by interpolation in space. With 12 buffer points, one can interpolate in time to 
obtain data for the buffer points at intermediate time levels. 

To use fewer than 12 buffer points, we can interpolate into all buffer points 
before starting an RK4 update as described, and then evolve all points except the 
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outermost points located exactly on the boundary, which are kept fixed at their initial 
interpolated value. The inner points next to the boundary are updated using second- 
order finite differencing for the centered derivatives and shifted advection stencils for 
the advection derivatives. Even though for large grids the number of buffer zones 
becomes negligible, for the grid sizes that we use, the buffer points affect the size of 
the grids significantly. For example, even for our largest inner box size of 80 points in 
one direction, adding six points on both sides instead of 12 or 24 points leads to 92, 104, 
and 128 points, respectively, which corresponds to a significant saving in the number 
of points in 3d since 104 3 /92 3 « 1.44 and 128 3 /92 3 « 2.69. For clarity, we always 
quote grid sizes without buffer points, because this is the number of points owned by a 
particular grid. Experimentally we have found that for the fourth-order case using just 
six buffer points leads to very small differences compared to 12 buffer points, but even 
smaller buffer zones lead to noticeable differences. For simulations with fourth-order 
accurate derivative operators, we have therefore chosen a standard setup of RK4 with 
dissipation and lop-sided advection stencils, 6 buffer points, quadratic interpolation in 
time, and Berger-Oliger time-stepping on all but the outermost grids following [16 . 

As is common in numerical relativity, we use symmetric finite-difference stencils 
for all spatial derivatives but the advection terms associated with the shift vector, 
where we use lop-sided upwind stencils, see e.g. [50] for the fourth-order accurate 
case. 

3.2. Artificial Dissipation 

In finite-difference codes targeted at smooth solutions of nonlinear hyperbolic 
equations, it is common practice to add artificial dissipation terms to all right-hand- 
sides of the time evolution equations, schematically written as 

d t u -> d t u + Qu. (2) 

Such dissipation terms are very efficient at suppressing very high-frequency waves, 
which are not part of the physical solution. This may be necessary for numerical 
stability [51j . but also to reduce numerical noise generated at mesh-refinement 
boundaries. As has become rather common in numerical relativity, we follow [51] 
and choose an operator (Q) of order 2r as 

Q = a(-h) 2r - 1 (D + y(D_y/2 2r , (3) 

for consistency with a 2r — 2 accurate scheme, with a a parameter regulating the 
strength of the dissipation. As we have done in the past with our fourth-order code 
we choose the factor a as a = 0.1 in the inner levels and a — 0.5 in the outer levels 
(where the waves are extracted). 

For high orders, these dissipation stencils become rather large (seven points for 
the fourth-order case and nine points for the sixth-order case). We therefore do not 
add dissipation terms where these stencils would "cross" mesh refinement boundaries. 
Also, adding dissipation terms with large stencils can lead to a loss of performance. 
We have therefore attempted to combine the use of sixth-order accurate stencils for 
the derivative operators with a fourth-order accurate dissipation operator and time 
integrator and second-order time interpolation at mesh refinement boundaries with an 
aggressively small number (6) of buffer points. 
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3. 3. Sixth order accurate finite difference operators 

For the sixth-order case we find that several choices for the advection term stencils 
yield stable evolutions, but the lop-sided upwind stencil which is closest to the 
symmetric case yields (probably not surprisingly) by far the best accuracy, i.e. we 
use 

rw - 2{ - 2 - 24/ - - 35/8 i! 0/l " 30/2 + 8/3 " /4 + ^P^f + ^ m 

bun 105 ax' 

Alternative asymmetric choices would be 

fl( v = -10/-! - 77/p + 150/! - 100/2 + 50/ 3 - 15/ 4 + 2/ 5 _ 1 <ff(x) 6 7 
1 [X> 60h 42 dx 7 [ h 

,,, x -147/p + 360A - 450/ 2 + 400/3 - 225/4 + 72/ 5 - 10/ 6 ) , l rf7(a) , a , n ,, 7 s 

/(X) = 60^ + 7^^ + ° {h } ' 

for the stencils that deviate more from the symmetric choice. We can see that the 
first choice has the smallest leading error term. The symmetric stencil has an even 
smaller error term, 

f(x) = -/3 + /-a-9(/ 2 -/- 2 )+45(/i-/-i) _ 

J V ' 60h 140 dx 7 V ' 

but does not show equally robust results, as is common for solving advection equations. 
For non-advection derivative terms we again use the standard symmetric stencil, 
similarly for second derivatives in one direction we use the symmetric stencil 

f"(x) -490/Q + 270(/i + - 27(/ 2 + /_ 2 ) + 2(/ 3 + /- 3 )) 1 m , )6 , 8) 
1 [ ' ~ 180/! 2 560 ; [ ' + { >■ 

For mixed derivatives, we use the stencils which result from a product of the symmetric 
sixth order accurate first derivative operators. 



4. Results for long equal-mass evolutions 

All runs are carried out with the symmetry (x, y, z) — » (— x, —y, z) and (x, y, z) — > 
(x, y, — z), reducing the computational cost by a factor of four. The Courant factor 
C = At /hi is kept constant, and is set to C = 1/2 for the inner grids, while for 
the outer grids at levels 0-4 the time step is kept constant at the value of level 3, 
following our previous work [16] . All runs presented here use six AMR buffer points, 
the same number that we have used for our fourth-order accurate code [16 . We 
stress that this is less than required to isolate the fine level "half" timestep from 
time interpolation errors at the mesh-refinement boundary, and in particular also less 
than required for a fully sixth-order scheme following the approach of [43] . The grid 
setups we have used for our simulations are displayed in table [1] (using the naming 
convention introduced in [16j). All the runs listed here have been performed on the 
Kepler cluster at the University of Jena (using Intel dual Woodcrest CPUs running 
at 2.66 GHz and an Infiniband interconnect), additional runs have been performed at 
LRZ Munich and the Doppler cluster at the University of Jena. We will denote the 
individual simulations by the inner-box size, i.e., 48, 56, 64, 72 or 80, as indicated in 
bold in table [TJ 

Our initial data are chosen as follows: the initial coordinate separation of the 
punctures is chosen as D = 12M, the horizon mass for each individual hole is chosen 
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Table 1. Grid setups used for convergence test simulations. The notation in the 
"Run" column is the same as we have used in 1 161 . The quantities h m i„ and h max 
(rounded to three digits) denote the finest and coarsest grid spacings, and r max is 
the location of the outer boundary (rounded to 4 digits), and all are in units of M. 
Also specified are the numbers of processors used, maximal memory requirement 
in GByte (to be precise, we quote the resident size of the program, i.e., the physical 
memory a task has used), and average speed in Af/hour for the Kepler cluster at 
the University of Jena (using Intel dual Woodcrest CPUs running at 2.66 GHz). 
The number in bold are used to indicate individual simulations throughout this 
paper. 



Run 




hmin 


hmax 


'max 


procs. 


mem. (GByte) 


M/hour 


Xr,=2[5 x 48 : 5 x 96 : 


6] 


1/32.0 


16 


776.0 


8 


12.2 


15.6 


Xr,=2[5 x 56 : 5 x 112 


:6] 


1/37.3 


96/7 


774.9 


12 


18.2 


11.6 


X^ =2 [5 x 64 : 5 x 128 


:6] 


1/42.7 


12 


774.0 


12 


22.5 


8.4 


Xr,=2[5 x 72 : 5 x 144 


:6] 


1/48.0 


32/3 


773.3 


16 


31.3 


3.5 


Xr,=2[5 x 80 : 5 x 160 


:6] 


1/53.3 


48/5 


772.8 


24 


45.4 


4.4 




-6 -4 -2 2 4 t/M 

x/M 



Figure 1. Left panel: Coordinate tracks of the puncture location of one black hole 
for simulations {64,72,80}. Only in the last few orbits are differences between 
the three runs discernable. Right panel: the waveform plotted as the real part of 
rh,22, as defined in |20| , 



as mi = 0.5M, which corresponds to puncture mass parameters of m p — 0.488M. The 
initial momenta are obtained as p = ±0.0850M from a 3PN-accurate quasicircularity- 
condition as in [16] . 

Our algorithm for gravitational wave extraction in terms of the Newman-Penrose 
scalar vE^ has been described in [IB]. It is useful to write the signal in terms of a 
time-dependent amplitude and phase as 

* 4 = A(i)e iv W, 

and define the gravitational wave frequency as u = (p. 

The coordinate tracks of the puncture locations are shown in figure [T] for 
the simulations for which we have obtained sixth-order convergence in the phase, 
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Figure 2. Left panel: coordinate distance of the black holes for the fourth order 
version of the 48-configuration and sixth-order simulations {48, 64, 80} in the order 
of increasing merger time. Right panel: the gravitational wave phase for the same 
runs. The 72 simulation would not be distinguishable from the 80 simulation on 
the scale shown here. 



1.5 



0.5 



((64) -(72))/2. 19244 
(72) - (80) 
conv. fact. 

err. estimate , 




l^ ; ",.ii'f : s;v!';--"^ — 

il M 



800 1000 1200 1400 1600 1800 
t/M 




800 



1000 



1200 1400 
t/M 



1600 



1800 



Figure 3. Convergence test for the gravitational wave phase. Plotted are the 
difference between the 72 and 80 runs, and the difference between the 64 and 72 
runs rescaled for sixth-order convergence. Also shown is the convergence factor 
divided by 6, which shows a "glitch" around the time that the phase increases 
very sharply, and the error estimate after performing Richardson extrapolation. 
The left panel shows a linear scaling, the right panel shows the same plot 
with a logarithmic scaling to emphasize the slow but clean exponential growth 
Sip = 0.0117 exp 0.0034/M of the phase error at intermediate times. 



{64,72,80}. Only during the last two orbits are differences between the three runs 
distinguishable. The orbital tracks show roughly 9 orbits before merger, and for the 
gravitational wave we obtain roughly 26 cycles before the ringdown signal becomes too 
noisy at t « 1960AT . The right panel of figure [I] shows the real part of the I = 2, m = 2 
mode of the rescaled strain rh-xi ( $4 = h, compare |20|). 

Figure [5] shows the coordinate distance and gravitational wave phase for the black 
holes for the {64, 80} simulations (the 72 simulation would not be distinguishable from 
the 80 run on the scale shown here). We find that lower resolutions merge earlier, and 
this is systematic for all the runs we have performed. For the {48, 64, 72, 80} runs we 
obtain "merger times" of t = (1746.8, 1790.7, 1797.5, 1800. 8)M, and the maximum of 
the radiation power is reached at times t = (1818.5, 1862.2, 1869.0, 1872.4) Af . "Merger 
time" here is understood only as a rough indicator of the time of merger, which is 
used for convergence tests, and is defined as the time when the coordinate distance 
of the punctures drops below 0.5M. For the {64,72,80} runs the merger times show 
convergence at order 5.55, the radiation peak times show convergence at order 5.4. 
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Figure 4. The I = 2, m = 2 mode of the wave signal is split into the absolute value 
of ^4,22 (left panel) and the wave frequency uj (right panel). Both panels show 
the simulations {64, 72, 80}, aligned in time to coincide at the peak of \^a\. The 
curves are clipped at early times, where they are very noisy due to the smallness 
of the signal and finite differencing error in computing the wave frequency from 
the phase. 
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Figure 5. Convergence plot for the wave amplitude ^422 m * ne ' = 2,m = 2 
mode. Both panels show the difference between the 72 and 80 runs and differences 
between the 64 and 72 runs rescaled for sixth-order convergence. In the left panel 
data at different resolutions are compared at the same coordinate time, which 
leads to a seeming loss of convergence near the radiation peak, which is due to 
the relatively large phase error. In the right panel the data are compared at 
the same value of the gravitational wave phase, which restores clean sixth order 
convergence. 



Richardson extrapolation with convergence order 5.5 yields an error estimate of « 4M 
for both times. Note that oscillations, which are probably mainly due to eccentricity, 
can clearly be seen in the black hole distance, and also in the wave amplitude shown 
in figure 0J A method to reduce eccentricity will be discussed in [52] . 

We have obtained roughly sixth order convergence for the gravitational wave 
phase between about t = 1000M and t — 1800M, as shown in figure [3] At earlier 
times the convergence factor becomes very noisy due to the smallness of the signal. 
Shortly before the merger the convergence factor "glitches" to a value of roughly 7. 
This problem can also clearly be seen in the convergence of the radiation frequency 
and amplitude as shown in figures [6] and [5j This "glitch" appears when the frequency 
and phase increase very sharply, and small phase errors have a large effect. The 
logarithmic scaling version of the convergence plot in figure [3] shows a slow and rather 
clean exponential growth for the phase error at intermediate times, a nonlinear fit 
for 300M < t < 1400M yields 6tp = 0.0117 exp(0.003t/M) for the phase error. This 
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Figure 6. Left panel: Reparametrisation of the black hole distance by 
gravitational wave phase yields clean sixth order convergence until the time of 
merger (which occurs roughly at the peak of the error) - the difference between 
the 72 and 80 runs and differences between the 64 and 72 runs rescaled for sixth- 
order convergence lie essentially on top of each other. Right panel: The difference 
in wave phase between the 4th and 6th order versions of the lowest resolution 48 
configuration shows the fourth-order algorithm "falling behind" . 



observation provides one way to optimize numerical methods in the inspiral phase, 
without evolving all the way to the merger. In the last stage of the inspiral the 
phase error grows very rapidly. We have noted this previously in a different context 
in [16j . where we have compared different methods to provide quasicircular inspiral 
data. Small changes on the order of 1% of the initial momenta have lead to drastic 
changes of rs 40M in merger time. In order to clarify the convergence behaviour 
of our code, we have applied a new technique to check for convergence in situations 
where the numerical error is dominated by phase shift: We first perform a convergence 
analysis for the dependence of the gravitational (or orbital) phase on code time, and 
then perform a standard convergence test on a quantity like the puncture separation 
(Fig. ©) or wave amplitude (Fig. ©) regarding the functional dependence of this 
quantity on the phase. 

An important question aside from convergence is how the sixth order and fourth 
order algorithms compare in absolute numbers. For this purpose we have re-run the 48 
configuration with our standard fourth order algorithm. We find that already at this 
low resolution the sixth order algorithm is superior, as shown in figures [2] and [6l while 
at higher resolutions the larger convergence factor increases the gain in accuracy. 

5. Conclusions 

We have described a minimal extension of the fourth-order accurate evolution 
algorithm described in [16j . where by replacing spatial fourth-order accurate 
differencing operators in the bulk of the grid by sixth-order accurate stencils, we 
gain drastic improvements in accuracy for the phase in long simulations of equal- 
mass inspiral. The crucial technical point regarding the choice of sixth-order accurate 
finite difference operators has been a specific choice for the advection stencil, which is 
used to discretize Lie derivative terms with respect to the shift vector in the Einstein 
equations. Using this method we have demonstrated evolutions of about nine orbits or 
1800M with a phase error of approximately 4M in the merger time, requiring « 11100 
CPU hours on our in-house cluster. We emphasize that our code has several lower- 
order accurate ingredients, which however do not seem to contribute significantly to 
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the numerical error at the resolutions we employ. 

Our emphasis here has been on boosting the current generation of "moving 
puncture" codes regarding their efficiency to analyse physical situations that require 
long evolutions, such as an accurate comparison with post- Newtonian results (sec [53 ), 
rather than on numerical analysis. Some technical questions certainly remain, such as 
the reduction of the numerical error at mesh refinement boundaries, the optimization 
for different architectures, and a rigorous mathematical analysis of numerical stability, 
e.g., by extending [54] to the BSSN system and the complications arising in the context 
of mesh-refinement. In future work we will present applications of our algorithm to 
other situations such as unequal masses and evolutions of spinning black holes. 
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